[Off-chain] feat: in-memory query cache(s) #994

Open
wants to merge 39 commits into
base: main

bryanchriswhite
Contributor

@bryanchriswhite bryanchriswhite commented Dec 11, 2024

Summary

Adds the QueryCache[T any] and HistoricalQueryCache[T any] interfaces, InMemoryCache[T any] implementation, configurations, and options functions.

---
title: Legend
---

classDiagram-v2

class GenericInterface__T__any {
    <<interface>>
    GenericMethod() T
}

class Implementation {
    ExportedField FieldType
    unexportedField FieldType
}

Implementation --|> GenericInterface__T__any: implements

%% class Embedder__T__any {
%%     <<interface>>
%%     GenericInterface[T]
%% }

%% Embedder__T__any ..|> GenericInterface__T__any: embeds
---
title: Caches
---

classDiagram-v2


class KeyValueCache__T__any {
    <<interface>>
    Get(key string) (value T, isCached bool)
    Set(key string, value T) (err error)
    Delete(key string)
    Clear()
}

class HistoricalKeyValueCache__T__any {
    <<interface>>
    GetLatestVersion(key string) (value T, isCached bool)
    GetVersion(key string, version int64) (value T, isCached bool)
    SetVersion(key string, value T, version int64) (err error)
}

class keyValueCache__T__any:::cacheImpl
keyValueCache__T__any --|> KeyValueCache__T__any

class historicalKeyValueCache__T__any
historicalKeyValueCache__T__any --|> HistoricalKeyValueCache__T__any
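For readers without the rendered diagram, the two interfaces above can be sketched in Go roughly as follows. The method sets are copied from the diagram; `mapCache` and `newMapCache` are hypothetical names added only to show the interface being satisfied, and are not the PR's `keyValueCache` implementation.

```go
package main

// KeyValueCache mirrors the non-historical interface from the diagram.
type KeyValueCache[T any] interface {
	Get(key string) (value T, isCached bool)
	Set(key string, value T) (err error)
	Delete(key string)
	Clear()
}

// HistoricalKeyValueCache mirrors the versioned interface from the diagram.
type HistoricalKeyValueCache[T any] interface {
	GetLatestVersion(key string) (value T, isCached bool)
	GetVersion(key string, version int64) (value T, isCached bool)
	SetVersion(key string, value T, version int64) (err error)
}

// mapCache is a hypothetical, non-thread-safe stand-in used only to show
// that KeyValueCache is satisfiable.
type mapCache[T any] struct {
	values map[string]T
}

func newMapCache[T any]() *mapCache[T] {
	return &mapCache[T]{values: make(map[string]T)}
}

func (c *mapCache[T]) Get(key string) (T, bool) {
	value, isCached := c.values[key]
	return value, isCached
}

func (c *mapCache[T]) Set(key string, value T) error {
	c.values[key] = value
	return nil
}

func (c *mapCache[T]) Delete(key string) { delete(c.values, key) }

func (c *mapCache[T]) Clear() { c.values = make(map[string]T) }

// Compile-time check that mapCache satisfies KeyValueCache.
var _ KeyValueCache[string] = (*mapCache[string])(nil)
```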

Issue

Type of change

Select one or more from the following:

Testing

  • Documentation: make docusaurus_start; only needed if you make doc changes
  • Unit Tests: make go_develop_and_test
  • LocalNet E2E Tests: make test_e2e
  • DevNet E2E Tests: Add the devnet-test-e2e label to the PR.

Sanity Checklist

  • I have tested my changes using the available tooling
  • I have commented my code
  • I have performed a self-review of my own code; both comments & source code
  • I create and reference any new tickets, if applicable
  • I have left TODOs throughout the codebase, if applicable

@bryanchriswhite bryanchriswhite added the off-chain Off-chain business logic label Dec 11, 2024
@bryanchriswhite bryanchriswhite self-assigned this Dec 11, 2024
@bryanchriswhite bryanchriswhite linked an issue Dec 11, 2024 that may be closed by this pull request
4 tasks
@bryanchriswhite bryanchriswhite marked this pull request as ready for review December 12, 2024 11:20
* pokt/main:
  [Relayminer, Bug] fix: sessiontree logger never initialized (#993)
  fix: E2E tests - RPC URL path (#1008)
  Updated cheat sheat docs with an example after installation (#1004)
  fix: Nil session tree logger (#1007)
@bryanchriswhite bryanchriswhite changed the base branch from fix/sessiontree to main December 13, 2024 14:24
Member

@Olshansk Olshansk left a comment

@bryanchriswhite I did a superficial review but did not dive into the validation of the business logic line-by-line.

Is there any section where you'd want another pair of 👀 ?

Co-authored-by: Daniel Olshansky <olshansky.daniel@gmail.com>
Member

@Olshansk Olshansk left a comment

Love the iteration. Thanks @bryanchriswhite!

Left a few new comments, a few replies to older threads, but it should be g2g after the next round 🙌

Contributor

@red-0ne red-0ne left a comment

Great caching design.

I left some comments but have not gotten to pkg/client/query/cache/memory_test.go yet; I'll do that right after this one.

@Olshansk Olshansk marked this pull request as ready for review February 12, 2025 00:48
Contributor

@red-0ne red-0ne left a comment

I left a few comments, but it feels like the current implementation is calling for a separate interface and implementation for each type of cache (historical, non-historical).

It will remove a lot of conditional branching and runtime errors.

@bryanchriswhite
Contributor Author

bryanchriswhite commented Feb 20, 2025

As discussed, reorganizing like so:

image

image

image

Contributor

@red-0ne red-0ne left a comment

This shape is easier to read and reason about 👍

I left some comments, but they should be trivial to address.

Clear()
}

// HistoricalKeyValueCache extends KeyValueCache to support getting and setting values
Contributor

HistoricalKeyValueCache no longer extends KeyValueCache :)

Comment on lines +51 to +55
// Get retrieves the value from the cache with the given key. If the cache is
// configured for historical mode, it will return the value at the latest **known**
// version, which is only updated on calls to SetAsOfVersion, and therefore is not
// guaranteed to be the current version w.r.t the blockchain.
func (c *keyValueCache[T]) Get(key string) (T, bool) {
Contributor

Suggested change
- // Get retrieves the value from the cache with the given key. If the cache is
- // configured for historical mode, it will return the value at the latest **known**
- // version, which is only updated on calls to SetAsOfVersion, and therefore is not
- // guaranteed to be the current version w.r.t the blockchain.
- func (c *keyValueCache[T]) Get(key string) (T, bool) {
+ // Get retrieves the value from the cache with the given key.
+ func (c *keyValueCache[T]) Get(key string) (T, bool) {

}

// Set adds or updates the value in the cache for the given key.
func (c *keyValueCache[T]) Set(key string, value T) error {
Contributor

Does it still make sense to return an error here?


import "cosmossdk.io/errors"

const codesace = "client/query/cache"
Contributor

Suggested change
- const codesace = "client/query/cache"
+ const codespace = "client/cache"

const codesace = "client/query/cache"

var (
ErrKeyValueCacheConfigValidation = errors.Register(codesace, 3, "invalid query cache config")
Contributor

Suggested change
- ErrKeyValueCacheConfigValidation = errors.Register(codesace, 3, "invalid query cache config")
+ ErrKeyValueCacheConfigValidation = errors.Register(codesace, 1, "invalid query cache config")


// getLatestVersionNumber returns the latest version number (not the value) of the given key.
func (c *historicalKeyValueCache[T]) getLatestVersionNumber(key string) int64 {
c.valuesMu.Lock()
Contributor

Shouldn't this be an RLock, especially since it's used in GetLatestVersion?
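
A minimal sketch of the read-lock variant, assuming `valuesMu` is a `sync.RWMutex`. The `versionIndex` type and its `latestVersions` map are hypothetical reductions for illustration, not the PR's actual fields.

```go
package main

import "sync"

// versionIndex is a hypothetical reduction of historicalKeyValueCache that
// tracks only the latest known version number per key.
type versionIndex struct {
	valuesMu       sync.RWMutex
	latestVersions map[string]int64
}

// getLatestVersionNumber only reads shared state, so a read lock is enough
// and lets concurrent readers proceed without blocking one another.
func (idx *versionIndex) getLatestVersionNumber(key string) int64 {
	idx.valuesMu.RLock()
	defer idx.valuesMu.RUnlock()
	return idx.latestVersions[key]
}

// setLatestVersionNumber mutates shared state and therefore takes the
// exclusive write lock.
func (idx *versionIndex) setLatestVersionNumber(key string, version int64) {
	idx.valuesMu.Lock()
	defer idx.valuesMu.Unlock()
	if version > idx.latestVersions[key] {
		idx.latestVersions[key] = version
	}
}
```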

}

// Update sortedDescVersions and ensure the list is sorted in descending order.
if _, versionExists := valueHistory.versionToValueMap[version]; !versionExists {
Contributor

Wdyt about returning an error if SetVersion tries to set a different value to an already existing historical value?

I don't think the following could be a valid use case:

cache.SetVersion(sameKey, value, sameVersion)
cache.SetVersion(sameKey, differentValue, sameVersion)
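
One possible shape for that check, as a sketch only. `ErrVersionConflict`, `versionedValues`, and `newVersionedValues` are hypothetical names, locking is omitted for brevity, and `reflect.DeepEqual` is used because `T any` does not support `==`.

```go
package main

import (
	"errors"
	"reflect"
)

// ErrVersionConflict is a hypothetical sentinel error; it is not defined in the PR.
var ErrVersionConflict = errors.New("version already cached with a different value")

// versionedValues is a hypothetical, non-thread-safe store: key -> version -> value.
type versionedValues[T any] struct {
	values map[string]map[int64]T
}

func newVersionedValues[T any]() *versionedValues[T] {
	return &versionedValues[T]{values: make(map[string]map[int64]T)}
}

// SetVersion stores value at version. Re-setting an identical value is a
// no-op; setting a different value for an existing version is rejected.
func (s *versionedValues[T]) SetVersion(key string, value T, version int64) error {
	history, ok := s.values[key]
	if !ok {
		history = make(map[int64]T)
		s.values[key] = history
	}
	if existing, exists := history[version]; exists {
		// T is unconstrained, so == is unavailable; fall back to DeepEqual.
		if !reflect.DeepEqual(existing, value) {
			return ErrVersionConflict
		}
		return nil
	}
	history[version] = value
	return nil
}
```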

}

// Validate ensures that the historicalKeyValueCacheConfig isn't configured with incompatible options.
func (cfg *historicalKeyValueCacheConfig) Validate() error {
Contributor

Shouldn't historicalKeyValueCacheConfig#Validate also call keyValueCacheConfig#Validate ?
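
A sketch of what delegating could look like, assuming the historical config embeds the base config. The field names, the embedding, and the specific validation rules are assumptions for illustration; the PR's actual structs may differ.

```go
package main

import (
	"errors"
	"time"
)

// keyValueCacheConfig fields are assumed for illustration.
type keyValueCacheConfig struct {
	maxKeys int64
	ttl     time.Duration
}

// Validate checks the rules shared by every cache flavor.
func (cfg *keyValueCacheConfig) Validate() error {
	if cfg.maxKeys < 0 {
		return errors.New("maxKeys MUST be >= 0")
	}
	if cfg.ttl < 0 {
		return errors.New("ttl MUST be >= 0")
	}
	return nil
}

// historicalKeyValueCacheConfig is assumed to embed the base config.
type historicalKeyValueCacheConfig struct {
	keyValueCacheConfig
	maxVersionAge int64
}

// Validate delegates to the embedded config first so the shared rules live
// in one place, then applies the historical-only rules.
func (cfg *historicalKeyValueCacheConfig) Validate() error {
	if err := cfg.keyValueCacheConfig.Validate(); err != nil {
		return err
	}
	if cfg.maxVersionAge < 0 {
		return errors.New("maxVersionAge MUST be >= 0")
	}
	return nil
}
```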

// returns an error.
func (c *historicalKeyValueCache[T]) SetVersion(key string, value T, version int64) error {
// DEV_NOTE: MUST call getLatestVersionNumber() before locking valuesMu.
latestVersion := c.getLatestVersionNumber(key)
Contributor

I think getLatestVersionNumber should be called under the calling method's lock, so that the latest version cannot change between the getLatestVersionNumber call and the caller's use of latestVersion.

This would require getLatestVersionNumber itself to not call Lock or RLock.

It would also involve some refactoring for GetLatestVersion and GetVersion though.
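
A rough sketch of that refactoring, under the assumption that the helper becomes lock-free and every exported method takes the lock itself. `historicalStore`, `latestVersionLocked`, and the `version > 0` rule are hypothetical and only illustrate the locking discipline.

```go
package main

import (
	"errors"
	"sync"
)

// historicalStore is a hypothetical reduction of historicalKeyValueCache.
type historicalStore[T any] struct {
	valuesMu       sync.RWMutex
	latestVersions map[string]int64
	values         map[string]map[int64]T
}

func newHistoricalStore[T any]() *historicalStore[T] {
	return &historicalStore[T]{
		latestVersions: make(map[string]int64),
		values:         make(map[string]map[int64]T),
	}
}

// latestVersionLocked MUST be called with valuesMu held (read or write).
// It takes no lock itself, so callers decide the critical section.
func (s *historicalStore[T]) latestVersionLocked(key string) int64 {
	return s.latestVersions[key]
}

// GetLatestVersion holds the read lock across both the version lookup and
// the value lookup.
func (s *historicalStore[T]) GetLatestVersion(key string) (T, bool) {
	s.valuesMu.RLock()
	defer s.valuesMu.RUnlock()
	value, isCached := s.values[key][s.latestVersionLocked(key)]
	return value, isCached
}

// SetVersion reads and updates the latest version inside one write-locked
// critical section, so no other writer can interleave between the two.
func (s *historicalStore[T]) SetVersion(key string, value T, version int64) error {
	if version <= 0 {
		return errors.New("version MUST be > 0") // illustrative rule only
	}
	s.valuesMu.Lock()
	defer s.valuesMu.Unlock()
	if version > s.latestVersionLocked(key) {
		s.latestVersions[key] = version
	}
	if s.values[key] == nil {
		s.values[key] = make(map[int64]T)
	}
	s.values[key][version] = value
	return nil
}
```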

require.False(t, isCached)
})

t.Run("historical cache ignores TTL expiration", func(t *testing.T) {
Contributor

Historical cache is still subject to maxKeys FIFO eviction policy.

I believe it's worth covering with a test.
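
The scenario such a test might exercise is sketched below against a hypothetical, non-thread-safe model (`fifoHistoricalCache`); it is not the PR's implementation. In the real test file, the body of `checkFIFOEviction` would presumably live in a `t.Run` subtest with `require` assertions.

```go
package main

import "errors"

// fifoHistoricalCache is a hypothetical model of a historical cache bounded
// by maxKeys with FIFO eviction of whole key histories.
type fifoHistoricalCache struct {
	maxKeys   int
	keyOrder  []string // insertion order; index 0 is the oldest key
	histories map[string]map[int64]string
}

func newFIFOHistoricalCache(maxKeys int) *fifoHistoricalCache {
	return &fifoHistoricalCache{
		maxKeys:   maxKeys,
		histories: make(map[string]map[int64]string),
	}
}

func (c *fifoHistoricalCache) SetVersion(key, value string, version int64) {
	if _, exists := c.histories[key]; !exists {
		// Adding a new key: evict the oldest key first if the cache is full.
		if c.maxKeys > 0 && len(c.keyOrder) >= c.maxKeys {
			oldest := c.keyOrder[0]
			c.keyOrder = c.keyOrder[1:]
			delete(c.histories, oldest)
		}
		c.keyOrder = append(c.keyOrder, key)
		c.histories[key] = make(map[int64]string)
	}
	c.histories[key][version] = value
}

func (c *fifoHistoricalCache) GetVersion(key string, version int64) (string, bool) {
	value, isCached := c.histories[key][version]
	return value, isCached
}

// checkFIFOEviction fills the cache past maxKeys and reports an error if the
// oldest key survived or a newer key was lost.
func checkFIFOEviction() error {
	cache := newFIFOHistoricalCache(2)
	cache.SetVersion("key1", "a", 1)
	cache.SetVersion("key2", "b", 1)
	cache.SetVersion("key3", "c", 1) // exceeds maxKeys; key1 is the oldest

	if _, isCached := cache.GetVersion("key1", 1); isCached {
		return errors.New("key1 should have been evicted")
	}
	for _, key := range []string{"key2", "key3"} {
		if _, isCached := cache.GetVersion(key, 1); !isCached {
			return errors.New(key + " should still be cached")
		}
	}
	return nil
}
```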

Member

@Olshansk Olshansk left a comment

Just realized I never published my review from yesterday :(

// value versions are pruned.
// E.g.: Given a latest version of 100, and a maxVersionAge of 10, then the
// oldest version that is not pruned is 90 (100 - 10).
// If 0, no historical pruning is performed. It ONLY applies when historical is true.
Member

Suggested change
- // If 0, no historical pruning is performed. It ONLY applies when historical is true.
+ // If 0, no historical pruning is performed.
+ // ONLY applies when historical is true.

historical bool
// maxVersionAge is the max difference between the latest known version and
// any other version, below which value versions are retained, and above which
// value versions are pruned.
Member

Are they pruned or just not considered cache hits? #PUC

If they're evicted, please make that explicit.
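
If the intent is eviction, the rule from the comment (latest version 100, maxVersionAge 10, oldest retained version 90) could be sketched as below. `pruneVersions` is a hypothetical helper operating only on the descending version list; whether the PR evicts or merely treats old versions as misses is exactly the open question here.

```go
package main

// pruneVersions splits a descending-sorted version list into the versions to
// retain and the versions to evict. A maxVersionAge of 0 disables pruning.
func pruneVersions(sortedDescVersions []int64, maxVersionAge int64) (retained, evicted []int64) {
	if maxVersionAge == 0 || len(sortedDescVersions) == 0 {
		return sortedDescVersions, nil
	}
	// The list is descending, so index 0 holds the latest known version.
	oldestRetained := sortedDescVersions[0] - maxVersionAge
	for i, version := range sortedDescVersions {
		if version < oldestRetained {
			// Everything from i onward is older still, so one split suffices.
			return sortedDescVersions[:i], sortedDescVersions[i:]
		}
	}
	return sortedDescVersions, nil
}
```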

@@ -0,0 +1,378 @@
package cache
Member

Can you add a new page in the docs with the diagram you have?

No need to add text or detail, but just want to make sure the diagram is not forgotten and the foundation to add details in the future is in place.

Comment on lines 45 to 48
// cacheValueHistory stores cachedValues by version number and maintains a sorted
// list of version numbers for which cached values exist. This list is sorted in
// descending order to improve performance characteristics by positively correlating
// index with age.
Member

Suggested change
- // cacheValueHistory stores cachedValues by version number and maintains a sorted
- // list of version numbers for which cached values exist. This list is sorted in
- // descending order to improve performance characteristics by positively correlating
- // index with age.
+ // cacheValueHistory maintains:
+ // - Cached values indexed by version number
+ // - A descending-sorted list of version numbers for existing cached values
+ //
+ // The descending sort order optimizes performance by correlating index with age.
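
To illustrate the descending-order invariant described in this comment, here is a small sketch of inserting a version while keeping the list sorted. `insertDesc` is a hypothetical helper built on the standard library's `sort.Search`; the PR may maintain the list differently.

```go
package main

import "sort"

// insertDesc inserts version into a descending-sorted slice, preserving the
// order and skipping versions that are already present.
func insertDesc(sortedDesc []int64, version int64) []int64 {
	// Binary search for the first index whose element is <= version.
	i := sort.Search(len(sortedDesc), func(j int) bool {
		return sortedDesc[j] <= version
	})
	if i < len(sortedDesc) && sortedDesc[i] == version {
		return sortedDesc // already tracked
	}
	// Grow by one, shift the tail right, and place the new version.
	sortedDesc = append(sortedDesc, 0)
	copy(sortedDesc[i+1:], sortedDesc[i:])
	sortedDesc[i] = version
	return sortedDesc
}
```

With this ordering the latest version is always at index 0 and older versions sit at higher indices, which is the index-to-age correlation the comment refers to.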

Comment on lines 78 to 81
// Get retrieves the value from the cache with the given key. If the cache is
// configured for historical mode, it will return the value at the latest **known**
// version, which is only updated on calls to SetAsOfVersion, and therefore is not
// guaranteed to be the current version w.r.t the blockchain.
Member

Suggested change
- // Get retrieves the value from the cache with the given key. If the cache is
- // configured for historical mode, it will return the value at the latest **known**
- // version, which is only updated on calls to SetAsOfVersion, and therefore is not
- // guaranteed to be the current version w.r.t the blockchain.
+ // Get retrieves a value from the cache using the provided key.
+ //
+ // For historical mode:
+ // - Returns the value at the latest known version
+ // - Latest version is only updated via SetAsOfVersion
+ // - No guarantee of returning current blockchain version

Comment on lines 99 to 103
// DEV_NOTE: Intentionally not pruning here to improve concurrent speed;
// otherwise, the read lock would be insufficient. The value will be
// overwritten by the next call to Set(). If usage is such that values
// aren't being subsequently set, maxKeys (if configured) will eventually
// cause the pruning of values with expired TTLs.
Member

Suggested change
- // DEV_NOTE: Intentionally not pruning here to improve concurrent speed;
- // otherwise, the read lock would be insufficient. The value will be
- // overwritten by the next call to Set(). If usage is such that values
- // aren't being subsequently set, maxKeys (if configured) will eventually
- // cause the pruning of values with expired TTLs.
+ // DEV_NOTE: Not pruning here to optimize concurrent speed:
+ // - Read lock alone would be insufficient for pruning
+ // - Next Set() call will overwrite the value
+ // - If values aren't subsequently set, maxKeys config will eventually trigger
+ //   pruning of TTL-expired values
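
The trade-off in this DEV_NOTE can be sketched as follows: an expired entry is reported as a miss under the read lock but is left in the map. `ttlCache`, `cachedValue`, `fakeClock`, and the injectable `now` function are hypothetical names used for illustration, not the PR's types.

```go
package main

import (
	"sync"
	"time"
)

type cachedValue[T any] struct {
	value    T
	cachedAt time.Time
}

// ttlCache is a hypothetical reduction of the non-historical cache.
type ttlCache[T any] struct {
	valuesMu sync.RWMutex
	ttl      time.Duration
	now      func() time.Time // injectable clock, assumed here for testability
	values   map[string]cachedValue[T]
}

func newTTLCache[T any](ttl time.Duration, now func() time.Time) *ttlCache[T] {
	return &ttlCache[T]{ttl: ttl, now: now, values: make(map[string]cachedValue[T])}
}

// Get treats a TTL-expired entry as a miss but does not delete it: deleting
// would need the write lock, which would serialize all readers.
func (c *ttlCache[T]) Get(key string) (T, bool) {
	c.valuesMu.RLock()
	defer c.valuesMu.RUnlock()

	var zero T
	cached, exists := c.values[key]
	if !exists {
		return zero, false
	}
	if c.ttl > 0 && c.now().Sub(cached.cachedAt) > c.ttl {
		return zero, false
	}
	return cached.value, true
}

// Set overwrites any existing (possibly expired) entry under the write lock.
func (c *ttlCache[T]) Set(key string, value T) {
	c.valuesMu.Lock()
	defer c.valuesMu.Unlock()
	c.values[key] = cachedValue[T]{value: value, cachedAt: c.now()}
}

// fakeClock is a manually advanced clock for deterministic examples.
type fakeClock struct{ current time.Time }

func (f *fakeClock) Now() time.Time          { return f.current }
func (f *fakeClock) Advance(d time.Duration) { f.current = f.current.Add(d) }
```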

Labels
off-chain Off-chain business logic
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

[Off-Chain] ModuleParamsClient & Historical Params
3 participants